Segregation in 14 U.S. Cities

Sunho Kim

University of Washington Information School

1. Introduction

This report measures the level of segregation in 14 cities in the United States. The data were collected from the U.S. Census Bureau, and segregation was measured with three metrics: Dissimilarity, Interaction, and Isolation. In addition to these three metrics, this report proposes a new metric that could help in understanding segregation from a different angle.

1-1. Data

Each city is divided into areas of the minimum size used by the U.S. Census, and each area has the following fields: GEOID, GEO.display.label, pop, pop.white, pop.not.white, pct.white, and pct.not.white. The segregation metrics use the white, non-white, and total populations.

1-2. Cities

2. Metric Definition

2-1. Dissimilarity

The dissimilarity index is the most widely used measure of segregation in terms of evenness. A city is considered least segregated when its majority and minority populations are evenly distributed. Dissimilarity measures the percentage of the population that would need to relocate for every sub-area to have the same ratio of majority to minority. As the dissimilarity index approaches 0, fewer people need to relocate, which means there is less segregation. The formula for dissimilarity is:
\[\frac{\sum_{i=1}^{n} [t_i|(p_i - P)|]}{[2TP(1-P)]}\] Where: 
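A minimal Python sketch of this formula may help make it concrete. The symbol readings are assumptions, since the definitions are not listed here: t_i is taken as the total population of area i, p_i as the minority proportion in area i, T as the city's total population, and P as the city-wide minority proportion. The function name `dissimilarity` and the sample counts are hypothetical and are not taken from the report's data.

```python
def dissimilarity(minority, total):
    """Dissimilarity index of one city from per-area counts.

    minority[i] is the minority population of area i, and
    total[i] is the total population of area i (t_i).
    """
    T = sum(total)                # city-wide total population
    if T == 0:
        raise ValueError("city has no population")
    P = sum(minority) / T         # city-wide minority proportion
    if P == 0 or P == 1:
        raise ValueError("index is undefined when only one group is present")

    numerator = 0.0
    for x, t in zip(minority, total):
        if t == 0:
            continue              # an empty area contributes nothing
        p = x / t                 # minority proportion in this area (p_i)
        numerator += t * abs(p - P)
    return numerator / (2 * T * P * (1 - P))


# Two hypothetical cities with two areas each (made-up counts).
print(dissimilarity([50, 50], [100, 100]))   # evenly mixed areas
print(dissimilarity([100, 0], [100, 100]))   # fully separated areas
```

With these made-up counts, the evenly mixed city gives 0.0 and the fully separated city gives 1.0, which matches the reading that values near 0 mean less segregation.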

2-2. Interaction

The interaction index measures segregation in terms of exposure. It measures the exposure of minority persons to majority persons “as the minority-weighted average of the majority proportion of the population in each areal unit”, so a lower interaction index indicates higher segregation. Put simply, more interaction means less segregation. The formula for the interaction index is:
\[\sum_{i=1}^{n} [\left( \frac{x_i}{X}\right)\left(\frac{y_i}{t_i}\right)]\] Where:
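The same formula can be sketched in Python. The symbol readings are assumptions, since the definitions are not listed here: x_i is taken as the minority population of area i, y_i as the majority population of area i, t_i as the total population of area i, and X as the city-wide minority population. The function name `interaction` and the sample counts are hypothetical.

```python
def interaction(minority, majority, total):
    """Interaction index of one city from per-area counts.

    minority[i], majority[i], and total[i] are read as x_i, y_i, and t_i.
    """
    X = sum(minority)             # city-wide minority population
    if X == 0:
        raise ValueError("index is undefined without a minority population")

    index = 0.0
    for x, y, t in zip(minority, majority, total):
        if t == 0:
            continue              # an empty area contributes nothing
        # share of the minority living here, times the majority share here
        index += (x / X) * (y / t)
    return index


# Two hypothetical cities with two areas each (made-up counts).
print(interaction([50, 50], [50, 50], [100, 100]))   # evenly mixed areas
print(interaction([100, 0], [0, 100], [100, 100]))   # fully separated areas
```

With these made-up counts, the evenly mixed city gives 0.5 and the fully separated city gives 0.0, in line with the reading that a lower value means less exposure to the majority.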

2-3. Isolation

Like the interaction index, the isolation index measures segregation in terms of exposure. The difference is that the isolation index measures “the extent to which minority members are exposed only to one another,” so a higher index indicates higher segregation. Put simply, the index represents isolation, and more isolation indicates more segregation. The formula for the isolation index is:
\[\sum_{i=1}^{n} [\left( \frac{x_i}{X}\right)\left(\frac{x_i}{t_i}\right)]\] Where:
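A Python sketch of this formula follows. The symbol readings are assumptions, since the definitions are not listed here: x_i is taken as the minority population of area i, t_i as the total population of area i, and X as the city-wide minority population. The function name `isolation` and the sample counts are hypothetical.

```python
def isolation(minority, total):
    """Isolation index of one city from per-area counts.

    minority[i] and total[i] are read as x_i and t_i.
    """
    X = sum(minority)             # city-wide minority population
    if X == 0:
        raise ValueError("index is undefined without a minority population")

    index = 0.0
    for x, t in zip(minority, total):
        if t == 0:
            continue              # an empty area contributes nothing
        # share of the minority living here, times the minority share here
        index += (x / X) * (x / t)
    return index


# Two hypothetical cities with two areas each (made-up counts).
print(isolation([50, 50], [100, 100]))   # evenly mixed areas
print(isolation([100, 0], [100, 100]))   # fully separated areas
```

With these made-up counts, the evenly mixed city gives 0.5 and the fully separated city gives 1.0, so a higher value corresponds to minority members living mostly among one another.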

3. Metric Computation

3-1. Dissimilarity

By the dissimilarity index, Oklahoma City is the least segregated city: it has the lowest dissimilarity index of the 14 cities. Here is a table of segregation levels in the cities, sorted by the dissimilarity index. The higher a city appears in the table, the less segregated it is.

3-2. Interaction

By the interaction index, Denver is the least segregated city, because it has the highest interaction index. Here is a table of segregation levels in the cities, sorted by the interaction index. The higher a city appears in the table, the less segregated it is.

3-3. Isolation

By the isolation index, Denver is the least segregated city, because it has the lowest isolation index. Notice that the order is the same as with the interaction index, even though the index values differ. Here is a table of segregation levels in the cities, sorted by the isolation index. The higher a city appears in the table, the less segregated it is.

4. Metric Comparison

Comparing the indexes across metrics, we see that the least segregated city depends on the metric. The least segregated cities are Oklahoma City (dissimilarity index), Denver (interaction index), and Denver (isolation index). The most segregated cities are Milwaukee (dissimilarity index), Baltimore (interaction index), and Baltimore (isolation index). The segregation rankings differ depending on the metric used, because different metrics focus on different aspects of segregation and use different formulas. In other words, different metrics represent different understandings of the cities. The interaction index and the isolation index produce the same order because the two indexes are complementary: they have the same order but different index values.

When all the indexes are plotted together, the graph becomes cluttered; however, we can still find some patterns. For example, the blue dashed line follows a pattern similar to the red solid line, and the blue dashed line moves opposite to the blue solid line. We will discuss the blue lines shortly, but here is the overall graph showing the different metrics with their index values:


[Figure: all metrics plotted together by city; annotations mark a "Similar pattern" and an "opposite pattern".]


As mentioned before, the interaction index and the isolation index are complementary: they move in opposite directions. So when we add the two index values, we get 1, as the following table shows:

To see this more clearly, check the following bar chart. When the isolation index and the interaction index are added, the result is always the same number, 1.
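One way to see why the sum is always 1 is to add the two formulas from Section 2. This is a short derivation under two assumptions that the report does not state explicitly: each area's total is exactly the sum of its two groups (t_i = x_i + y_i), and X is the sum of all x_i.

\[\sum_{i=1}^{n} \left(\frac{x_i}{X}\right)\left(\frac{y_i}{t_i}\right) + \sum_{i=1}^{n} \left(\frac{x_i}{X}\right)\left(\frac{x_i}{t_i}\right) = \sum_{i=1}^{n} \left(\frac{x_i}{X}\right)\left(\frac{x_i + y_i}{t_i}\right) = \sum_{i=1}^{n} \frac{x_i}{X} = 1\]

Under those assumptions the two indexes carry the same information, which is consistent with their producing the same ranking of cities.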

The interaction index and the isolation index are complementary, and they point in opposite directions: a high isolation index means more segregation, while a high interaction index means less segregation. As a result, the segregation rankings by the interaction index and the isolation index are the same, while they differ from the ranking by the dissimilarity index:

5. Metric Proposal

Here is a proposal for a new metric. It takes the difference between the size of the majority and the size of the minority, as a proportion of the total population. Its graph follows a pattern similar to that of the interaction index, and it is assumed that cities with lower index values are more likely to be segregated. The advantage of this new metric is its simplicity; the disadvantage is that it may not be very accurate, since it measures only the city as a whole. The formula for the new metric is:
\[\frac{|Y - X|}{T}\]

Where:
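A small Python sketch of the proposed formula is given here. It assumes X, Y, and T are the city-wide minority, majority, and total populations, which is a reading of the symbols and not a definition taken from the report. The function name `proposed_metric` and the sample counts are hypothetical.

```python
def proposed_metric(minority, majority, total):
    """Proposed metric |Y - X| / T for one city, from per-area counts.

    The per-area lists are only summed, so the metric depends on
    city-wide totals and not on how people are spread across areas.
    """
    X = sum(minority)             # city-wide minority population
    Y = sum(majority)             # city-wide majority population
    T = sum(total)                # city-wide total population
    if T == 0:
        raise ValueError("city has no population")
    return abs(Y - X) / T


# Two hypothetical cities with the same totals but different layouts.
mixed = proposed_metric([30, 30], [70, 70], [100, 100])
separated = proposed_metric([60, 0], [40, 100], [100, 100])
print(mixed, separated)
```

Both made-up cities give 0.4 even though their areas are arranged very differently, which illustrates the disadvantage noted above: the metric looks only at the city as a whole.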

When we use the new metric, we get the following segregation ranking:


We can compare the new index with all of the previous indexes in the following graph:


Here is a new graph that compares the new index with the interaction index:


These metrics deserve a deeper look, and further research on segregation levels is necessary. Thank you for reading.